knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
Mental health challenges among university students are a growing concern, with academic stress often contributing to anxiety and depression. This analysis examines the impact of exam and non-exam periods on SSRI and SNRI prescription rates across Scottish health boards from 2018 to 2023.
Key questions include: - How do prescription rates differ between exam and non-exam periods? - How do demographic factors, such as the percentage of young adults (17–25), influence trends? - Which health boards show the highest changes in prescription rates?
By integrating prescription, demographic, and geographic data, this study highlights seasonal and regional variations in mental health support needs, offering insights for targeted interventions and resource allocation.
library(janitor)
library(here)
library(dbplyr)
library(tidyverse)
Load and clean demographic datasets
HB_names dataset is processed to select only the health
board code and names.population_data is processed to extract the total
population of each health board.age_data is processed to extract the total number of
young adults aged 17 to 25, the typical age range of university
students. The dataset is then joined with the
population_data to create a column of percentage. This
would give a clear view of which health board regions to consider in the
analysis.library(stringr)
# Load and clean Health board names data HB_names
HB_names <- read_csv("C:/Data_Science/B223640/data/HB_names.csv") %>%
clean_names() %>%
select(hb, hb_name) %>%
rename(hbt = hb) %>%
mutate(hb_name = str_remove(hb_name, "^NHS\\s"))
# Load and clean population_data
population_data <- read_csv("C:/Data_Science/B223640/data/UV103_age_health_board_census.csv", skip = 10) %>%
rename(hb_name = "Health Board Area 2019",
hb_population = Count) %>%
filter(Age == "All people" & Sex == "All people") %>%
select(hb_name, hb_population) %>%
mutate(hb_name = str_remove(hb_name, "^NHS\\s")) %>%
arrange(desc(hb_population))
# Load and clean age_data
age_data <- read.csv("C:/Data_Science/B223640/data/UV103_age_health_board_census.csv", skip = 10) %>%
filter(str_detect(Age, "17|18|19|20|21|22|23|24|25")) %>%
rename(hb_name = "Health.Board.Area.2019") %>%
group_by(hb_name) %>%
summarise(totalpeople = sum(Count)) %>%
left_join(population_data, by = "hb_name") %>%
mutate(percentage = totalpeople * 100 / hb_population) %>%
arrange(desc(percentage))
Loading and processing the exam season and non-exam season datasets from 2018 to 2023
The exam season dataset includes:
The non-exam season dataset includes data from:
Processing the exam season and non-exam season data sets
HB_names and
age_data)library(tidyverse)
# Define column_types first to ensure consistent data parsing across all files by explicitly defining how each column should be interpreted
column_types <- cols(
HBT = col_character(),
DMDCode = col_character(),
BNFItemCode = col_character(),
BNFItemDescription = col_character(),
PrescribedType = col_character(),
GPPractice = col_double(),
NumberOfPaidItems = col_double(),
PaidQuantity = col_double(),
GrossIngredientCost = col_double(),
PaidDateMonth = col_double()
)
# Define a reusable function to process datasets
process_exam_nonexam_data <- function(file_paths, hb_names, age_data) {
file_paths %>%
map_dfr(~ read_csv(., col_types = column_types)) %>%
clean_names() %>%
mutate(bnf_item_description = str_remove(bnf_item_description, "[_\\s].*")) %>%
full_join(hb_names, by = c("hbt" = "hbt")) %>%
full_join(age_data, by = "hb_name") %>%
filter(str_detect(bnf_item_description, "ESCITALOPRAM|SERTRALINE|FLUOXETINE|VENLAFAXINE|PAROXETINE|CITALOPRAM")) %>%
filter(hb_name %in% c("Lothian", "Grampian", "Greater Glasgow and Clyde", "Tayside", "Fife", "Forth Valley", "Lanarkshire"))
}
# List all CSV files for exam and non-exam datasets
examfiles <- list.files("C:/Data_Science/B223640/data/examszn_18-23", pattern = "csv", full.names = TRUE)
nonexamfiles <- list.files("C:/Data_Science/B223640/data/nonexamszn_18-23", pattern = "csv", full.names = TRUE)
# Process the exam and non-exam datasets using the function
examszndata <- process_exam_nonexam_data(examfiles, HB_names, age_data)
nonexamszndata <- process_exam_nonexam_data(nonexamfiles, HB_names, age_data)
Steps for Analysis
Compute the prescription rates for SSRIs and SNRIs during both exam and non-exam periods by accounting for the total population in each health board.
Add the percentage of individuals aged 18–25 from the demographic data to assess the influence of young adult populations on prescription trends.
Calculate the percentage change in prescription rates between exam and non-exam periods to identify seasonal differences in mental health needs.
Use pivot_wider() to convert the data from long to
wide format, where each drug becomes a separate column:
Use the gt package to generate a well-formatted
table that summarizes key metrics, including per capita rates,
demographic data, and percentage changes, for clear presentation of
findings.
# Load required libraries
library(tidyverse)
library(gt)
library(dplyr)
# Define a reusable function to process wide-format data
process_wide_data <- function(data) {
data %>%
select(hb_name, bnf_item_description, paid_date_month, hb_population, paid_quantity) %>%
group_by(hb_name, bnf_item_description) %>%
summarise(total_prescriptions = sum(paid_quantity), .groups = "drop") %>%
pivot_wider(
names_from = bnf_item_description,
values_from = total_prescriptions,
values_fill = list(total_prescriptions = 0)
) %>%
rowwise() %>% # Ensure row-wise summation
mutate(total_prescriptions = sum(c_across(CITALOPRAM:VENLAFAXINE), na.rm = TRUE)) %>%
ungroup() %>%
full_join(age_data) %>%
drop_na() # Remove rows with NA values
}
# Process exam and non-exam data using the function
examszndatawide <- process_wide_data(examszndata)
nonexamszndatawide <- process_wide_data(nonexamszndata)
# Add prescription per capita to examszndatawide and nonexamszndatawide
examszndatawide <- examszndatawide %>%
mutate(prescription_per_capita = total_prescriptions / hb_population)
nonexamszndatawide <- nonexamszndatawide %>%
mutate(prescription_per_capita = total_prescriptions / hb_population)
# Combine data for comparison, including the percentage column
combined_prescriptions <- examszndatawide %>%
select(hb_name, prescription_per_capita_exam = prescription_per_capita, percentage) %>%
left_join(
nonexamszndatawide %>%
select(hb_name, prescription_per_capita_non_exam = prescription_per_capita),
by = "hb_name"
)
# Add a change column to show the percentage difference between exam and non-exam seasons
combined_prescriptions <- combined_prescriptions %>%
mutate(
change_percent = ((prescription_per_capita_exam - prescription_per_capita_non_exam) / prescription_per_capita_non_exam) * 100
)
# Create a prettier table with grand_summary_rows
pretty_table <- combined_prescriptions %>%
arrange(desc(prescription_per_capita_exam)) %>%
gt() %>%
tab_header(
title = "Prescription Per Capita and Age Group Percentage",
subtitle = "Exam Season vs Non-Exam Season (2018-2023)"
) %>%
fmt_number(
columns = vars(prescription_per_capita_exam, prescription_per_capita_non_exam, percentage, change_percent),
decimals = 2
) %>%
cols_label(
hb_name = "Health Board",
prescription_per_capita_exam = "Exam Season (Per Capita)",
prescription_per_capita_non_exam = "Non-Exam Season (Per Capita)",
percentage = "17–25 Age Group (%)",
change_percent = "Change (%)"
) %>%
tab_spanner(
label = "Prescription Per Capita",
columns = vars(prescription_per_capita_exam, prescription_per_capita_non_exam, change_percent)
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(everything())
) %>%
tab_style(
style = cell_fill(color = "lightblue"),
locations = cells_body(
columns = vars(change_percent),
rows = change_percent > 0
)
) %>%
tab_style(
style = cell_fill(color = "lightpink"),
locations = cells_body(
columns = vars(change_percent),
rows = change_percent < 0
)
) %>%
grand_summary_rows(
columns = vars(prescription_per_capita_exam, prescription_per_capita_non_exam, percentage),
fns = list(
Avg = ~mean(., na.rm = TRUE)
),
formatter = fmt_number,
decimals = 2
) %>%
tab_source_note(
source_note = "Data sourced from 2018-2023 Prescription Records"
)
# Print the improved table
pretty_table
| Prescription Per Capita and Age Group Percentage | |||||
| Exam Season vs Non-Exam Season (2018-2023) | |||||
| Health Board |
Prescription Per Capita
|
17–25 Age Group (%) | |||
|---|---|---|---|---|---|
| Exam Season (Per Capita) | Non-Exam Season (Per Capita) | Change (%) | |||
| Greater Glasgow and Clyde | 83.41 | 80.25 | 3.93 | 25.06 | |
| Forth Valley | 82.67 | 78.52 | 5.28 | 21.22 | |
| Fife | 80.26 | 74.47 | 7.77 | 21.54 | |
| Lanarkshire | 72.08 | 72.50 | −0.59 | 19.75 | |
| Grampian | 68.26 | 64.58 | 5.70 | 20.83 | |
| Lothian | 64.53 | 61.34 | 5.20 | 25.61 | |
| Tayside | 61.97 | 56.09 | 10.48 | 22.20 | |
| Avg | — | 73.31 | 69.68 | — | 22.32 |
| Data sourced from 2018-2023 Prescription Records | |||||
Key Findings
The table reveals that most health boards show higher prescription rates during exam periods, with an average increase from 69.68 to 73.31 prescriptions per capita. Tayside (+10.48%) and Fife (+7.77%) exhibit the most significant increases, reflecting heightened academic stress. In contrast, Lanarkshire shows a slight decrease (-0.59%), indicating regional variations in mental health needs or prescribing practices.
While regions with higher young adult populations, such as Greater Glasgow and Clyde (25.06%), generally have elevated prescription rates, Lothian (25.61%) does not follow this trend, suggesting alternative support systems or differing prescribing norms.
These findings highlight the need for targeted mental health interventions during exam periods, particularly in regions with significant increases, and further investigation into prescribing disparities to ensure equitable support across Scotland.
Define a Function to Process the Data
The following function processes the exam and non-exam datasets: -
Cleans column names. - Joins with demographic data
(HB_names, population_data, and
age_data). - Filters for SSRIs and SNRIs commonly
prescribed for exam-related stress. - Focuses on specific health boards
with high percentages of young adults. - Calculates total prescriptions
and per-person prescription rates.
# Define a function to process and label the data
process_and_label_data <- function(data, season) {
data %>%
group_by(hb_name, bnf_item_description, totalpeople, percentage, hb_population) %>%
summarise(total_prescription = sum(paid_quantity, na.rm = TRUE), .groups = "drop") %>%
filter(!is.na(bnf_item_description)) %>%
mutate(
Season = season,
per_person = total_prescription / hb_population
)
}
# Process exam and non-exam data
examszndata1 <- process_and_label_data(examszndata, "examseason")
nonexamszndata1 <- process_and_label_data(nonexamszndata, "nonexamseason")
# Combine the datasets and clean
combined_data <- bind_rows(examszndata1, nonexamszndata1) %>%
drop_na(bnf_item_description) # Clean combined dataset
library(forcats)
library(ggplot2)
library(plotly)
# Create the ggplot without text labels
p <- ggplot(combined_data, aes(
x = total_prescription,
y = fct_reorder(bnf_item_description, total_prescription, .fun = sum, .desc = TRUE), # Reorder drugs in descending order
fill = Season
)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.8)) +
labs(
title = "Comparison of Total Prescriptions by Drug",
x = "Total Prescriptions",
y = "Drug",
fill = "Season"
) +
theme_minimal() +
facet_wrap(~fct_reorder(hb_name, total_prescription, .fun = sum, .desc = TRUE), ncol = 2) # Reorder health boards in descending order
# Make the plot interactive with only `total_prescription` and `Season` in the tooltip
interactive_plot <- ggplotly(p, tooltip = c("x", "fill"))
# Display the interactive plot
interactive_plot
Results: Seasonal Variations in SSRI and SNRI Prescriptions Across Scottish Health Boards
Total prescriptions for SSRIs and SNRIs differ between exam and non-exam seasons, with higher rates during exam periods, suggesting increased stress among students.
Greater Glasgow and Clyde and Lothian have the highest total prescriptions, correlating with larger student populations.
Fife, Forth Valley, and Tayside show lower totals but follow similar seasonal patterns.
Sertraline, Fluoxetine, and Citalopram are the most commonly prescribed drugs, while Escitalopram and Paroxetine have lower rates.
Exam seasons show a slight increase in prescriptions, highlighting academic stress as a driver for mental health support.
Merging and Cleaning the Data This code combines the
exam and non-exam datasets (examszndata and
nonexamszndata) into a single dataset,
joinedfulldata. Key steps include:
year and period for
analysis.# Helper function to extract year and assign period
add_year_and_period <- function(data) {
data %>%
mutate(
year = substr(paid_date_month, 1, 4),
period = case_when(
substr(paid_date_month, 5, 6) %in% c("04", "05") ~ "Semester 1 Finals",
substr(paid_date_month, 5, 6) %in% c("11", "12") ~ "Semester 2 Finals",
substr(paid_date_month, 5, 6) %in% c("01", "02") ~ "Semester 2 Start",
substr(paid_date_month, 5, 6) %in% c("09", "10") ~ "Semester 1 Start",
TRUE ~ NA_character_
)
)
}
# Combine exam and non-exam datasets and add year/period columns
joinedfulldata <- examszndata %>%
full_join(nonexamszndata) %>%
add_year_and_period() %>%
group_by(year, period)
The next step summarizes the total quantity of prescriptions paid (paid_quantity) by year and period, and visualizes the data using a bar and line plot.
library(ggplot2)
library(plotly)
summary_totals <- joinedfulldata %>%
group_by(year, period) %>%
summarise(total_paid_quantity = sum(paid_quantity, na.rm = TRUE), .groups = "drop")
# Create the base ggplot with both bar and line layers
p <- ggplot(summary_totals, aes(x = factor(year), y = total_paid_quantity)) +
# Bar chart
geom_bar(aes(fill = factor(year)), stat = "identity", position = "dodge", alpha = 0.7) +
# Line graph
geom_line(aes(group = period), color = "black", size = 1) +
geom_point(aes(group = period), color = "black", size = 2) +
# Add faceting for periods
facet_wrap(~ period) +
labs(
title = "Yearly Comparison of Drug Prescriptions for University Periods",
x = "Year",
y = "Total Paid Quantity",
fill = "Year"
) +
scale_y_continuous(labels = scales::comma, limits = c(0, 32000000)) +
theme_minimal() +
theme(
strip.text = element_text(size = 12, face = "bold"),
axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "none"
)
# Convert to an interactive plot
interactive_plot <- ggplotly(p, tooltip = c("x", "y"))
# Display the interactive plot
interactive_plot
Observations
The visualization shows a consistent increase in prescription totals from 2018 to 2023 across all university periods indicating a growing reliance on SSRIs and SNRIs for mental health support.
Exam Stress: Semester Finals have slightly higher prescription totals than Semester Starts, reflecting increased stress during exams.
Post-2020 Surge: Growth is particularly notable from 2020 onwards, likely linked to the COVID-19 pandemic’s impact on student mental health.
The following code visualizes the yearly trends in total paid quantities of six commonly prescribed SSRIs and SNRIs across four academic periods: Semester 1 Finals, Semester 2 Finals, Semester 1 Start, and Semester 2 Start. The visualization allows comparisons of prescription trends over time for each drug.
library(stringr)
summary_data2 <- joinedfulldata %>%
group_by(year, period, bnf_item_description) %>%
summarise(total_paid_quantity = sum(paid_quantity, na.rm = TRUE), .groups = "drop")
library(ggplot2)
library(dplyr)
library(plotly)
# Create the ggplot2 chart
interactive_plot <- ggplotly(
ggplot(summary_data2, aes(x = year, y = total_paid_quantity, color = period, group = period)) +
geom_line(size = 1) +
geom_point(size = 2) +
facet_wrap(~ bnf_item_description, scales = "free_y", ncol = 2) + # Adjust number of columns for better spacing
labs(
title = "Yearly Trends of Drug Prescriptions",
x = "Year",
y = "Total Paid Quantity",
color = "Academic Period"
) +
theme_minimal() +
theme(
strip.text = element_text(size = 8, face = "bold"),
axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
axis.text.y = element_text(size = 8)
),
tooltip = c("x", "y", "color")
)
interactive_plot
Prescription rates peak during Semester 2 Finals across all drugs, suggesting heightened stress and anxiety due to cumulative academic pressure and the significance of end-of-year exams. Although Semester 1 Finals also see increased prescriptions, they are lower than Semester 2, possibly due to the lesser importance of mid-year exams and additional spring semester stressors like internships.
This trend is evident in all six drugs, particularly Sertraline, Citalopram, and Venlafaxine. Post-2020, the prescription gap between Semester 2 Finals and other periods widens, likely due to increased stress from the COVID-19 pandemic.
Shift in Prescribing Preferences
Although in the bar graph from 4.2 shows a steady increase in the total number of SSRI and SNRI prescriptions, not all 6 drugs show the same trend according to the line graphs in 4.3, this is possibly due a few reasons. The prescribing preferences have shifted towards sertraline and escitalopram due to their better side effect profiles and wider therapeutic applications. These drugs are perceived as safer and more tolerable compared to older medications like fluoxetine, paroxetine, and citalopram. Sertraline and escitalopram are widely used not only for depression but also for anxiety disorders, which has increased their overall demand. As newer research and guidelines emerge, healthcare providers tend to favor these well-studied and safer medications (Carvalho et al., 2016).
Conversely, the use of older SSRIs such as fluoxetine and paroxetine has plateaued or declined. The competition from newer SSRIs and SNRIs has limited the growth of these older medications. Additionally, paroxetine, in particular, has developed an adverse reputation due to its association with withdrawal difficulties and side effects like weight gain, reducing its popularity over time. This dynamic reflects the ongoing evolution of prescribing practices as healthcare providers aim to optimize treatment efficacy and patient safety.(Bourin et al., 2001)
Remarks
The consistently higher prescription rates during Semester 2 Finals underscore the critical mental health challenges faced by students during this period. This finding highlights the importance of targeted interventions, such as counseling, stress management workshops, and peer support programs, during the second semester to address the heightened needs of the student population.
References
Vedhara, K., Hyde, J., Gilchrist, I., Tytherleigh, M., & Plummer, S. (2000). Acute stress, memory, attention and cortisol. Psychoneuroendocrinology, 25(6), 535–549. https://doi.org/10.1016/s0306-4530(00)00008-1
Sheryl Ankrom (2024, August 20). How long does it take for antidepressants to work? Verywell Mind. https://www.verywellmind.com/how-long-does-it-take-for-antidepressants-to-work-2584277
Carvalho, A. F., Sharma, M. S., Brunoni, A. R., Vieta, E., & Fava, G. A. (2016). The Safety, Tolerability and Risks Associated with the Use of Newer Generation Antidepressant Drugs: A Critical Review of the Literature. Psychotherapy and Psychosomatics, 85(5), 270–288. https://doi.org/10.1159/000447034
Bourin, M., Chue, P., & Guillon, Y. (2001). Paroxetine: A review. CNS Drug Reviews, 7(1), 25–47. https://doi.org/10.1111/j.1527-3458.2001.tb00189.x ```